1. Introduction. Many problems in applied mathematics involve an integral or an
expectation  of  a  convex  function.  For  example,  in  mechanics,  the  average  load  on  a  given 
structure  may  be  expressed  as  the  expectation  of  convex  functions  (or  linear  combinations 
of  convex  functions)  of  the  stresses.  In  finance,  the  present  value  of  an  option  on  a  stock 
may  be  the  expected  value  of  a  convex  function  of  the  future  value  of  the  stock  price.  In 
computer  systems,  the  performance  (throughput)  may  be  a  concave  (i.e.,  the  negative  of  a 
convex)  function  of  the  system  load.  The  average  performance  is  then  the  expectation  of 
this  performance  function.  Our  interest  has  been  in  functions  arising  in  stochastic  programs, 
mathematical  optimization  models  that  involve  random  variables. 

The  basic  problem  is  to  find: 

$$E f(x) = E\{f(x(\omega))\} = \int f(x(\omega))\,P(d\omega), \tag{1.1}$$

where $x$ is a random vector mapping the probability space, $(\Omega, \mathcal{A}, P)$, onto $(\Re^N, \mathcal{B}^N, F)$, $F$ is the distribution function of $x$, and $x \in X \subset \Re^N$. The expectation functional, $Ef(x)$, can also be written as a Lebesgue-Stieltjes integral with respect to $F$:

$$E f(x) = \int f(x)\,dF(x). \tag{1.2}$$

Difficulties arise in evaluating $Ef(x)$ when either the function $f$ is difficult to evaluate (for example, requiring a complex simulation for each function evaluation as in the computer system example) or the distribution function $F$ is not known exactly (for example, when the demands on the computer system are not known beforehand). Many approximation formulas for integrals (1.2) have been given (see Davis and Rabinowitz [8]), but they either do not provide efficient error bounds or impose strict differentiability requirements on the integrand. In this paper, we review methods for evaluating upper bounds on $Ef(x)$ with limited distribution information and convex $f$. In addition, we provide a method for evaluating upper bounds on a large class of convex functions on $\Re^1$ given first and second moment information about the distribution of $x$. We also show how this result can be extended to a general bound for convex functions on $\Re^N$. The bounds follow from basic properties of convex functions and from a generalized moment problem interpretation of the problem.

Section 2 provides background on previous approaches to bounding expectations. The generalized moment problem interpretation is given in Section 3. Section 4 presents basic results and some examples in $\Re^1$. Section 5 provides the multi-dimensional extension.

2. Background and previous integral approximations. For $f(x) = x$, the expectation functional is the first moment or mean of the random vector, $x$, with respect to the distribution function, $F$. For $f(x) = x^i$, the expectation functional is the $i$th moment of $x$ with respect to $F$. A function $f : X \to (-\infty, +\infty]$ for $X$ a convex set is convex on $X$ if and only if $f((1-\lambda)x + \lambda y) \le (1-\lambda) f(x) + \lambda f(y)$ for all $\lambda \in (0,1)$ and every $x$ and $y$ in $X$. In the following, we assume that $X$ is convex and that every convex function $f$ on $X$ is proper, i.e., there exists $x \in X$ such that $f(x) < +\infty$, and $f(x) > -\infty$ for all $x \in X$. A function $f$ is concave on $X$ if $-f$ is convex on $X$. A function $f$ is affine on $X$ if $f$ is finite-valued, convex and concave on $X$. If $f(\lambda x) = \lambda f(x)$ for all $\lambda \in (0,\infty)$ and $x \in \Re^N$, then $f$ is positively homogeneous (of degree one). A positively homogeneous function $f : \Re^N \to (-\infty, +\infty]$ that is also convex on $\Re^N$ is sublinear. A sublinear function associated with any convex function is the recession function, $f0^+$, defined as (see [31])

$$(f0^+)(y) = \sup\{f(x+y) - f(x) \mid f(x) < \infty\}. \tag{2.1}$$

In the sequel, we also consider optimization problems of the form

$$\sup_{x \in C} f(x), \tag{2.2}$$

where $C$ is a subset of a linear space. A point $x$ is feasible in (2.2) if $x \in C$. A feasible point $x$ is an extreme point of $C$ if there do not exist $y$ and $z$ in $C$ such that $x = (1-\lambda)y + \lambda z$, $0 < \lambda < 1$, other than $x = y = z$. A point $x^*$ is optimal in (2.2) if it is feasible and $f(x^*) = \sup_{x \in C} f(x)$. In this case, $x^*$ attains the supremum of $f$ over $C$. When the supremum of $f$ is assumed to be attained for some $x^* \in C$, we replace "supremum" by "maximum."

For general functions $f$, the basic procedures to approximate $Ef(x)$ use some form of a discrete approximation for the distribution of $x$. For $X = [a,b] \subset \Re^1$, the most basic procedures are the midpoint and the trapezoidal approximations. If one interval is used, these approximations apply to a uniform distribution on $[a,b]$. They are $M_1(f) = f((a+b)/2)$ for the midpoint and $T_1(f) = (f(a) + f(b))/2$ for the trapezoidal approximation. The approximations are improved by dividing $[a,b]$ into subintervals, appropriately weighting the subintervals, and applying $M_1$ and $T_1$ on each subinterval.
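As a minimal sketch of these two rules, the following code applies the midpoint and trapezoidal approximations on equal subintervals of a uniform distribution. The function names and the test integrand $e^{-x}$ are illustrative assumptions, not part of the original development.

```python
import math

def midpoint_expectation(f, a, b, n=1):
    """Composite midpoint estimate of E f(x) for x uniform on [a, b]."""
    h = (b - a) / n
    # Each of the n subintervals carries probability 1/n.
    return sum(f(a + (k + 0.5) * h) for k in range(n)) / n

def trapezoid_expectation(f, a, b, n=1):
    """Composite trapezoidal estimate of E f(x) for x uniform on [a, b]."""
    h = (b - a) / n
    return sum((f(a + k * h) + f(a + (k + 1) * h)) / 2 for k in range(n)) / n

def f(x):
    return math.exp(-x)  # illustrative convex integrand (an assumption)

exact = 1.0 - math.exp(-1.0)  # E f(x) for x uniform on [0, 1]
m1 = midpoint_expectation(f, 0.0, 1.0)
t1 = trapezoid_expectation(f, 0.0, 1.0)
m8 = midpoint_expectation(f, 0.0, 1.0, n=8)
t8 = trapezoid_expectation(f, 0.0, 1.0, n=8)
```

For a convex integrand the midpoint value lies below the exact expectation and the trapezoidal value above it, and both gaps shrink as the number of subintervals grows.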

A more sophisticated procedure is Gaussian quadrature, which finds an integral formula that fits all polynomials up to some degree. As noted by Miller and Rice [29], this can be used to find a discretization with $N$ values that matches the first $N+1$ moments of the distribution of $x$. To match the first three moments of the uniform distribution on $[a,b]$, for example, Gaussian quadrature selects two points, $(a+b)/2 \pm (\sqrt{3}/6)(b-a)$, with equal probability, 1/2.
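The moment-matching claim for the two-point rule can be checked numerically. The sketch below, with hypothetical helper names, compares the first three moments of the two-point discretization with those of the uniform distribution on $[0,1]$.

```python
import math

def two_point_gauss(a, b):
    """Two equally weighted points (a+b)/2 +/- (sqrt(3)/6)(b-a)."""
    mid = (a + b) / 2
    off = (math.sqrt(3) / 6) * (b - a)
    return [(mid - off, 0.5), (mid + off, 0.5)]

def discrete_moment(points, k):
    """k-th moment of a finite discrete distribution [(x, p), ...]."""
    return sum(p * x**k for x, p in points)

def uniform_moment(a, b, k):
    """k-th moment of the uniform distribution on [a, b]."""
    return (b**(k + 1) - a**(k + 1)) / ((k + 1) * (b - a))

points = two_point_gauss(0.0, 1.0)
errors = [abs(discrete_moment(points, k) - uniform_moment(0.0, 1.0, k))
          for k in (1, 2, 3)]
fourth_gap = abs(discrete_moment(points, 4) - uniform_moment(0.0, 1.0, 4))
```

The first three moment errors vanish up to rounding, while the fourth moment differs, as expected for a two-point rule.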


A difficulty with using the Gaussian quadrature formulas is that they do not generally provide bounds on the expectation. Restrictions on higher-order derivatives and Peano's theorem [30] may be used to provide bounds, but that requires, at least, differentiability of $f$ and a density function that may not be available. Generalizations of the midpoint and trapezoidal approximations do, however, obtain bounds on the expectation of a convex function. Jensen's inequality [21] can be interpreted as a generalization of the midpoint approximation that provides a lower bound on the expected value of convex $f$ through the following:

$$\int_{\Re^N} f(x)\,dF(x) \ge f\left(\int_{\Re^N} x\,dF(x)\right), \tag{2.3}$$

where $\bar{x} = \int_{\Re^N} x\,dF(x)$ is assumed finite.

Madansky, following Edmundson ([27], [12]), provided a generalization of the trapezoidal approximation, called the Edmundson-Madansky inequality, that gives an upper bound on the expectation of a convex function. For $N = 1$, the basic inequality is:

$$\int_a^b f(x)\,dF(x) \le \frac{(b - \bar{x}) f(a) + (\bar{x} - a) f(b)}{b - a}, \tag{2.4}$$

where $[a,b]$ is a finite interval. The Edmundson-Madansky inequality (2.4) can also be extended to multiple dimensions and infinite intervals (see, for example, [1], [14], and [16]).
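The Jensen bound (2.3) and the Edmundson-Madansky bound (2.4) need only the mean and the interval endpoints. A small sketch, assuming the illustrative integrand $e^{-x}$ and a uniform distribution on $[0,1]$ for comparison:

```python
import math

def jensen_lower(f, mean):
    """Jensen lower bound (2.3): f evaluated at the mean."""
    return f(mean)

def edmundson_madansky_upper(f, mean, a, b):
    """Edmundson-Madansky upper bound (2.4) on a finite interval [a, b]."""
    return ((b - mean) * f(a) + (mean - a) * f(b)) / (b - a)

def f(x):
    return math.exp(-x)  # illustrative convex function (an assumption)

mean = 0.5
lower = jensen_lower(f, mean)
upper = edmundson_madansky_upper(f, mean, 0.0, 1.0)
exact_uniform = 1.0 - math.exp(-1.0)  # E f(x) for x uniform on [0, 1]
```

The exact value for the uniform case sits between the two bounds, and in this example it is closer to the Jensen bound than to the Edmundson-Madansky bound.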

Refinements of the Jensen and Edmundson-Madansky inequalities are possible by subdividing the interval (or, more generally, the region) into smaller pieces on which the bounds can be reapplied as in the traditional midpoint and trapezoidal approximations (see [3], [15], [19], and [22]). These refinements require additional function evaluations and conditional expectations on the subregions. As has been observed, the Jensen lower bound is generally reasonably accurate relative to the Edmundson-Madansky upper bound (e.g., [17]), which requires more function evaluations. The primary concern is then in obtaining more accurate upper bounds without additional computational effort. This paper shows that this is possible for a class of convex functions if second moment information is available. The next section provides the generalized moment problem formulation that leads to this new bound.

3. Generalized moment problem. To obtain bounds that hold for all distributions with certain properties, we can find

$$\begin{aligned}
& Q \in \mathcal{P}, \text{ a set of probability measures on } (X, \mathcal{B}^N), \text{ subject to} \\
& \int_X v_i(x)\,Q(dx) \le \alpha_i, \quad i = 1, \ldots, s, \\
& \int_X v_i(x)\,Q(dx) = \alpha_i, \quad i = s+1, \ldots, M, \\
& \text{to maximize } \int_X f(x)\,Q(dx),
\end{aligned} \tag{3.1}$$

where $M$ is finite and the $v_i$ are bounded, continuous functions. A solution of (3.1) obtains an upper bound on the expectation of $f$ with respect to any probability measure satisfying the conditions above. Problem (3.1) is a generalized moment problem ([25]). When the $v_i$ are powers of $x$, the constraints restrict the moments of $x$ with respect to $Q$. In this context, (3.1) determines an upper bound when only limited moment information on a distribution is available.

Problem  3.1  can  also  be  interpreted  as  an  abstract  linear  program  since  the  objective 
and  constraints  are  linear  functions  of  the  probability  measure.  The  solution  is  then  an 
extreme  point  (see  [33]  for  a  discussion  of  properties)  in  the  infinite  dimensional  space  of 
probability  measures.  The  following  theorem,  proven  in  [23,  Theorem  2.1],  gives  the  explicit 
solution  properties. 


Theorem 1. Suppose $X$ is compact. Then the set of feasible measures in (3.1), $\mathcal{Q}$, is convex and compact (with respect to the weak* topology), and $\mathcal{Q}$ is the closure of the convex hull of the extreme points of $\mathcal{Q}$. If $f$ is continuous relative to $X$, then an optimum (maximum or minimum) of $\int_X f(x)\,Q(dx)$ is attained at an extreme point of $\mathcal{Q}$. The extremal measures of $\mathcal{Q}$ are those measures that have finite support, $\{x_1, \ldots, x_L\}$, with $L \le M+1$, such that the vectors

$$\begin{pmatrix} v_1(x_1) \\ v_2(x_1) \\ \vdots \\ v_M(x_1) \\ 1 \end{pmatrix}, \; \ldots, \; \begin{pmatrix} v_1(x_L) \\ v_2(x_L) \\ \vdots \\ v_M(x_L) \\ 1 \end{pmatrix} \tag{3.2}$$

are linearly independent.


Kemperman [24] showed that the supremum is attained under more general continuity assumptions and provides conditions (which we assume to hold) for $\mathcal{Q}$ to be nonempty. Dupačová's (formerly Žáčková) [9, 10, 36] work on a minimax approach to stochastic programming led to the use of the moment problem as a bounding procedure for stochastic programs. She showed that (3.1) attains the Edmundson-Madansky bound (and the Jensen bound if the objective is minimized) when the only constraint in (3.1) is $v_1(x) = x$, i.e., the constraints fix the first moment of the probability measure. She also provided some properties of the solution with an additional second moment constraint ($v_2(x) = x^2$) for a specific objective function $f$. The general problem can be solved using the generalized linear programming procedure in [7, Chapter 24]. This procedure is given below.


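Generalized Linear Programming Procedure for the Generalized Moment Problem (GLP)

Step 0. Initialization. Identify a set of $L \le M+1$ linearly independent vectors as in (3.2) that satisfy the constraints in (3.1). (Note that a phase-one objective ([7]) may be used if such a starting solution is not immediately available. For $N = 1$, the Gaussian quadrature points may be used as mentioned above.) Let $\nu = L$, $k = 1$, go to 1.

Step 1. Master problem solution. Find $p_1 \ge 0, \ldots, p_\nu \ge 0$ such that

$$\begin{aligned}
& \sum_{l=1}^{\nu} p_l = 1, \\
& \sum_{l=1}^{\nu} v_i(x_l)\,p_l \le \alpha_i, \quad i = 1, \ldots, s, \\
& \sum_{l=1}^{\nu} v_i(x_l)\,p_l = \alpha_i, \quad i = s+1, \ldots, M, \text{ and} \\
& z = \sum_{l=1}^{\nu} f(x_l)\,p_l \text{ is maximized.}
\end{aligned} \tag{3.3}$$

Let $\{p_1^k, \ldots, p_\nu^k\}$ attain the optimum in (3.3), and let $\{\theta^k, \pi_1^k, \ldots, \pi_M^k\}$ be the associated dual multipliers such that

$$\begin{aligned}
& \theta^k + \sum_{i=1}^{M} \pi_i^k v_i(x_l) = f(x_l), \quad \text{if } p_l^k > 0, \; l = 1, \ldots, \nu, \\
& \theta^k + \sum_{i=1}^{M} \pi_i^k v_i(x_l) \ge f(x_l), \quad \text{if } p_l^k = 0, \; l = 1, \ldots, \nu, \\
& \pi_i^k \ge 0, \quad i = 1, \ldots, s.
\end{aligned} \tag{3.4}$$

Step 2. Subproblem solution. Find $x_{\nu+1}$ that maximizes

$$\rho(x, \theta^k, \pi^k) = f(x) - \theta^k - \sum_{i=1}^{M} \pi_i^k v_i(x). \tag{3.5}$$

If $\rho(x_{\nu+1}, \theta^k, \pi^k) > 0$, let $\nu = \nu + 1$, $k = k + 1$ and go to 1.

Otherwise, stop; $\{p_1^k, \ldots, p_\nu^k\}$ are the optimal probabilities associated with $\{x_1, \ldots, x_\nu\}$ in a solution to (3.1).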

The proof of the convergence of GLP is given in [7, Chapter 24]. This result is used in [13] to solve a class of problems (3.1). The difficulty in GLP is in the solution of the subproblem (3.5), which generally involves a nonconvex function. Birge and Wets [3] describe how to solve (3.5) with constrained first and second moments, if convexity properties of $\rho$ can be identified. Cipra [6] describes other methods for this problem based on discretizations and random selections of candidate points, $x_l$.

This paper considers specific conditions so that (3.1) can be solved without requiring the repeated nonconvex optimization in (3.5). The goal is to show that a large class of problems for $N = 1$ require only $L = 2$ points of support that can be identified in one line search, that the points of support are analytically calculable for certain functions, and that the solution procedure can be extended to a general class of functions in multiple dimensions. The next section describes the basic results.

Figure 1. The generalized moment problem in $\Re^1$.

4. Two-point support functions. In this section, we restrict (3.1) to the case of $N = 1$, $s = 0$, and $M = 2$, where again the constraints correspond to first and second moment constraints. To distinguish this case of (3.1), we refer to it as Problem (3.1-(1,0,2)). The problem is illustrated geometrically in Figure 1. Here, $X = [0, .6]$ and $C$ is the convex hull of $(x, x^2, f(x))$ for $x \in X$. The first moment, $\bar{x}$, and the second moment, $\bar{x}^{(2)}$, are shown. The objective in (3.1-(1,0,2)) is to find $y^* = (\bar{x}, \bar{x}^{(2)}, z^*) \in C$ that maximizes $z$. A generalization of Carathéodory's theorem ([35]) for the convex hull of a connected set tells us that $y^*$ can be expressed as a convex combination of at most three extreme points of $C$, giving us a special case of Theorem 1. Therefore, an optimal solution to (3.1-(1,0,2)) can be written $\{x^*, p^*\}$, where the points of support, $x^* = \{x_1^*, x_2^*, x_3^*\}$, have probabilities $p^* = \{p_1^*, p_2^*, p_3^*\}$. We will show that, for a broad class of two-point support functions, this can indeed be further restricted to two points, $\{x_1^*, x_2^*\}$, with positive probability.

A useful result of the linear programming interpretation of (3.1) is the presence of a dual problem to (3.1). The dual to (3.1) is known as a semi-infinite program ([18]) that appears, for example, in Chebyshev approximation ([32]). For the one-dimensional, two-moment constraint problem considered here, this dual is to find $\theta, \pi_1, \pi_2$ such that

$$\begin{aligned}
& \theta + \pi_1 x + \pi_2 x^2 \ge f(x), \quad \forall x \in X, \\
& \text{and } \theta + \pi_1 \bar{x} + \pi_2 \bar{x}^{(2)} \text{ is minimized.}
\end{aligned} \tag{4.1}$$

Note that (4.1) involves three variables and an infinite number of constraints in contrast to the infinite dimensional, finitely constrained primal problem (3.1-(1,0,2)). Note also that an optimal solution to (4.1) is a quadratic function that dominates $f$ and that has minimum expectation with respect to any probability measure in $\mathcal{Q}$.
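The bounding property of the dual follows from a one-line weak duality argument, written out here as a sketch. The symbols $\bar{x}$ and $\bar{x}^{(2)}$ denote the prescribed first and second moments, and $Q$ is any measure feasible in (3.1-(1,0,2)); this notation is assumed from the surrounding text.

```latex
\int_X f(x)\,Q(dx)
  \;\le\; \int_X \left(\theta + \pi_1 x + \pi_2 x^2\right) Q(dx)
  \;=\; \theta + \pi_1 \bar{x} + \pi_2 \bar{x}^{(2)}
  \qquad \text{for every } (\theta, \pi_1, \pi_2) \text{ feasible in (4.1).}
```

The inequality uses only the domination constraint in (4.1), and the equality uses only the two moment constraints, so every dual-feasible quadratic yields a valid upper bound.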


The optimality conditions on a feasible solution to (3.1), $x^* = \{x_1^*, x_2^*, x_3^*\}$, with associated probabilities, $p^* = \{p_1^*, p_2^*, p_3^*\}$, are that there exist dual variables, $\theta^*, \pi_1^*, \pi_2^*$, such that

$$\begin{aligned}
& \theta^* + \pi_1^* x_l^* + \pi_2^* (x_l^*)^2 = f(x_l^*) \quad \text{if } p_l^* > 0, \\
& \theta^* + \pi_1^* x + \pi_2^* x^2 \ge f(x), \quad \forall x \in X,
\end{aligned} \tag{4.2}$$

where the first condition is known as the complementary slackness condition and the second condition is dual feasibility. A useful interpretation of these conditions in terms of the function $\rho$ defined in (3.5) is that $x_l^*$ has a positive probability, $p_l^*$, only if $\rho(x_l^*, \theta^*, \pi^*) = 0$ and $x_l^*$ maximizes $\rho$ over $X$ for fixed $(\theta^*, \pi^*)$. It is convenient to let $\rho(x, \theta, \pi) = f(x) - q(x, \theta, \pi)$, where $q(x, \theta, \pi) = \theta + \pi_1 x + \pi_2 x^2$. The following lemmas give additional properties for an optimal solution $\{x^*, p^*\}$ to (3.1-(1,0,2)). We assume that $f$ is always convex below. The first lemma refers to feasible solutions of (3.1-(1,0,2)).


Lemma 1. If a feasible solution to (3.1-(1,0,2)) is obtained with positive probability support points, $x_1 < x_2 < x_3$, then another feasible solution exists with support points, $x_1$ and $x_4$, where $x_2 < x_4 < x_3$.

Proof: This result is straightforward. Feasibility of $\{x_1, x_2, x_3\}$ implies there exists $\{p_1, p_2, p_3\} > 0$ such that $\sum_{l=1}^{3} p_l (x_l, x_l^2) = (\bar{x}, \bar{x}^{(2)})$, or $p_2 (x_2 - x_1, x_2^2 - x_1^2) + p_3 (x_3 - x_1, x_3^2 - x_1^2) = (\bar{x} - x_1, \bar{x}^{(2)} - x_1^2)$. This implies that the line $(x_1 + t(\bar{x} - x_1), x_1^2 + t(\bar{x}^{(2)} - x_1^2))$ is a convex combination of the lines, $(x_1 + t(x_l - x_1), x_1^2 + t(x_l^2 - x_1^2))$, $l = 2, 3$. All three lines intersect $(x, x^2)$ at $(x_1, x_1^2)$. The convexity property implies $(x_1 + t(\bar{x} - x_1), x_1^2 + t(\bar{x}^{(2)} - x_1^2))$ intersects $(x, x^2)$ at $x_4$ such that $x_2 < x_4 < x_3$. ∎
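The construction in Lemma 1 can be illustrated numerically. In the sketch below, the three-point distribution and the helper names are hypothetical; the replacement point is computed from the two moment equations with the smallest support point held fixed.

```python
def moments(points, probs):
    """First and second moments of a finite discrete distribution."""
    mean = sum(p * x for x, p in zip(points, probs))
    second = sum(p * x * x for x, p in zip(points, probs))
    return mean, second

def reduce_to_two_points(points, probs):
    """Keep the smallest support point x1 and solve for x4 and its weight."""
    x1 = points[0]
    mean, second = moments(points, probs)
    x4 = (second - mean * x1) / (mean - x1)  # matches both moments
    p1 = (x4 - mean) / (x4 - x1)
    return (x1, x4), (p1, 1.0 - p1)

pts, pr = (0.0, 0.5, 1.0), (0.2, 0.5, 0.3)   # illustrative data
two_pts, two_pr = reduce_to_two_points(pts, pr)
```

Here $x_4$ works out to a weighted average of $x_2$ and $x_3$ with positive weights $p_l (x_l - x_1)$, which is why it lands strictly between them.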

The next lemma considers $f$ with a derivative $f'$ that has local convexity or concavity properties. These properties form the basis for the bounding approach given in this paper.

Lemma 2. If $\rho(\hat{x}, \theta, \pi) = 0$ for some $(\theta, \pi)$ feasible in (4.1), $\hat{x}$ is in the interior of $X$, and $f'$ is convex or concave on $A = (\hat{x} - \eta, \hat{x})$ or $B = (\hat{x}, \hat{x} + \eta)$ for some $\eta > 0$, then there exists $\delta > 0$ such that, for any $\epsilon > 0$, $\epsilon < \delta$, $f'(\hat{x} - \epsilon) \ge q'(\hat{x} - \epsilon, \theta, \pi)$ for $f'$ convex or concave on $A$ and $f'(\hat{x} + \epsilon) \le q'(\hat{x} + \epsilon, \theta, \pi)$ for $f'$ convex or concave on $B$.

Proof: Let $\rho(\cdot) = \rho(\cdot, \theta, \pi)$ and let $q(\cdot) = q(\cdot, \theta, \pi)$. Note that $q$ is differentiable on $X$ and that, for $f$ convex, $f$ is continuously differentiable on all but a countable number of points in $X$. Since $\rho$ is then densely differentiable, there exist right and left open neighborhoods of $\hat{x}$ on which $\rho$ is differentiable. The following limits are well-defined:

$$\rho'^{-}(\hat{x}) = \lim_{\epsilon \downarrow 0} \rho'(\hat{x} - \epsilon); \quad \rho'^{+}(\hat{x}) = \lim_{\epsilon \downarrow 0} \rho'(\hat{x} + \epsilon). \tag{4.3}$$

If $\rho(\hat{x}) = 0$ for feasible $(\theta, \pi)$ in (4.1), then $\hat{x}$ maximizes $\rho$. Therefore, $0 \in [\rho'^{+}(\hat{x}), \rho'^{-}(\hat{x})]$. By the local convexity (or concavity) assumption for $f'$ around $\hat{x}$, there exist intervals $(\hat{x} - t, \hat{x})$ or $(\hat{x}, \hat{x} + t)$ ($t > 0$) over which $\rho'$ has constant sign. Suppose $f'(\hat{x} - \epsilon) \le q'(\hat{x} - \epsilon, \theta, \pi)$ for all $0 < \epsilon < t$ and $f'(\hat{x} - s) < q'(\hat{x} - s, \theta, \pi)$ for some $0 < s < t$; then $\rho(\hat{x}) < \rho(\hat{x} - t)$, contradicting the maximality of $\hat{x}$. A similar argument holds for $(\hat{x}, \hat{x} + t)$, proving the result. ∎

The previous lemma considers local convexity properties of $f'$ when it exists. The following results refer to functions with derivatives that are convex and then concave.

Lemma 3. Let $g(x) = h(x) - c(x)$ be a function such that $g(x)$ is increasing on $\Re$, $h$ is convex on $(-\infty, y)$ and concave on $(y, \infty)$, and $c(x)$ is an affine function on $\Re$. Then there exists a partition of $(-\infty, \infty)$ into subintervals, $I_1 = (-\infty, a_1)$, $I_2 = [a_1, a_2)$, $I_3 = [a_2, a_3)$, $I_4 = [a_3, +\infty)$, $-\infty \le a_1 \le a_2 \le a_3 \le \infty$, such that $g(x) \ge 0$ for all $x \in I_1 \cup I_3$ and $g(x) \le 0$ for all $x \in I_2 \cup I_4$.


Proof: First note that $g$ is continuous on $(-\infty, y)$ and $(y, \infty)$. Also, $g$ cannot change sign from negative to positive twice on $(-\infty, y)$ by the convexity of $h$ in that region. If $g$ has two sign changes on $(-\infty, y)$, then there exist intervals, $I_1, I_2, I_3$, such that $g$ is positive on $I_1 \cup I_3$ and negative on $I_2$. For $g$ increasing, $g(y) \ge 0$. By $h$ concave on $(y, \infty)$, $g$ has at most one sign change on $(y, \infty)$, giving the result when $g$ has two sign changes on $(-\infty, y)$. A similar argument applies if $g$ has one or no sign changes on $(-\infty, y)$. ∎

The next lemma considers the case where $g = \rho'$ is constant on an interval. In the following, we use the notation $X = [a, b]$ for convenience. It is assumed that this also includes the cases $X = (-\infty, b]$, $X = [a, +\infty)$, and $X = (-\infty, +\infty)$ unless explicitly stated otherwise.

Lemma 4. If $f$ is convex on $X = [a, b]$ with derivative $f'$ defined as a convex function on $(a, c)$ and as a concave function on $(c, b)$ for $a \le c \le b$ and if $\rho(x, \theta, \pi) = 0$ for some $(\theta, \pi)$ feasible in (4.1) and for all $x \in (\hat{x} - \epsilon, \hat{x} + \epsilon)$ for some $\hat{x} \in X$ and $\epsilon > 0$, then there exists an interval $D \supset (\hat{x} - \epsilon, \hat{x} + \epsilon)$ such that $\rho(x, \theta, \pi) = 0$ for all $x \in D$ and $\rho(x, \theta, \pi) < 0$ for all $x$ outside the closure of $D$.

Proof: Let $D = (d, e)$ be the largest interval including $(\hat{x} - \epsilon, \hat{x} + \epsilon)$ such that $\rho(x) = 0$ for all $x \in D$. By Lemma 2 and the assumption, if $d > a$, there exists an interval, $(d - \eta, d)$, and, if $e < b$, there exists an interval $(e, e + \eta)$, for $\eta > 0$, such that $\rho' > 0$ on $(d - \eta, d)$ and $\rho' < 0$ on $(e, e + \eta)$. By the convex-concave assumption, $f'$ is convex on $[a, d)$ and concave on $(e, b]$ since $q'$ is affine. Hence, $\rho'$ would be strictly positive on $[a, d)$ and strictly negative on $(e, b]$. This would imply that $\rho$ is negative on $(e, b]$, violating the feasibility of $(\theta, \pi)$ in (4.1). Hence, $e = b$, giving the result. ∎


The  convex-concave  property  is  now  used  to  derive  our  main  result  about  two-point 
support  functions. 

Theorem 2. If $f$ is convex with derivative $f'$ defined as a convex function on $(a, c)$ and as a concave function on $(c, b)$ for $X = [a, b]$ and $a \le c \le b$, then there exists an optimal solution to (3.1-(1,0,2)) with at most two support points, $\{x_1, x_2\}$, with positive probabilities, $\{p_1, p_2\}$.

Proof: Let $(\theta, \pi)$ be an optimal solution to (4.1). First, assume that there does not exist $\epsilon > 0$, $\hat{x} \in (a, b)$ such that $\rho(x, \theta, \pi) = 0$ for all $x \in (\hat{x} - \epsilon, \hat{x} + \epsilon)$. By Lemmas 2 and 3, the only isolated points where $\rho$ could be 0 and maximized are $a_1$ and $a_3$ if $[a, b] \supset I_2 \cup I_3$. If $[a, b] \not\supset I_2$, then $a$ can replace $a_1$, and if $[a, b] \not\supset I_3$, then $b$ can replace $a_3$, but, in either case, at most two points meet the conditions for optimality.

If there exists $\epsilon > 0$, $\hat{x} \in (a, b)$ such that $\rho(x, \theta, \pi) = 0$ for all $x \in (\hat{x} - \epsilon, \hat{x} + \epsilon)$, then Lemma 4 implies that any optimal solution $\{x_1, x_2, x_3\}$ must be in the closure of $D$ and that $\rho(x) = 0$ for all $x \in D$. By Lemma 1, we can select $x_4$ in $(x_2, x_3)$ such that there exists $\{p_1, p_4\}$ so that $\{x_1, x_4, p_1, p_4\}$ is feasible in (3.1-(1,0,2)). The optimality conditions still hold for $\rho(x_4) = 0$. Hence, $\{x_1, x_4, p_1, p_4\}$ is optimal in (3.1-(1,0,2)). ∎

A corollary of Theorem 2 is that any function $f$ that has a convex or concave derivative has the two-point support property. The class of functions that meets the criteria of Theorem 2 contains many useful examples. Some of these functions are given below:

1. Polynomials defined over ranges with at most one third derivative sign change.

2. Exponential functions of the form $c_0 e^{c_1 x}$, $c_0 \ge 0$.

3. Logarithmic functions of the form $\log_k(cx)$, for any $k > 0$.

4. Certain hyperbolic functions such as $\sinh(cx)$, $c, x \ge 0$, and $\cosh(cx)$.

5. Certain trigonometric functions such as $\tan^{-1}(cx)$, $c, x \ge 0$.

In fact, Theorem 2 can be applied to provide an upper bound on the expectation of any convex function with known third derivative when the distribution function has a known third moment, $\bar{x}^{(3)}$. Suppose $a \ge 0$ (if not, then this argument can be applied on $[a, 0]$ and $[0, b]$), then let $g(x) = \alpha x^3 + f(x)$. The function $g$ is still convex on $[0, b]$ for $\alpha \ge 0$. By defining $\alpha \ge (-1/6) \min(0, \inf_{x \in [a,b]} f'''(x))$, $g'$ is convex on $[a, b]$, and an upper bound, $UB(g)$, on $Eg(x)$ has a two-point support. The expectation of $f$ is then bounded by

$$Ef(x) \le UB(g) - \alpha \bar{x}^{(3)}. \tag{4.4}$$
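The two steps behind (4.4) can be spelled out as a short sketch, writing $\alpha$ for the cubic coefficient so that it stays distinct from the endpoint $a$ (this naming is an assumption made here for readability).

```latex
g'''(x) = f'''(x) + 6\alpha \;\ge\; 0 \quad \text{on } [a,b]
  \qquad \text{whenever } \alpha \ge -\tfrac{1}{6} \min\Bigl(0, \inf_{x \in [a,b]} f'''(x)\Bigr),
\\[4pt]
E f(x) \;=\; E g(x) - \alpha\, E x^3 \;\le\; UB(g) - \alpha\, \bar{x}^{(3)} .
```

The first line makes $g'$ convex, so Theorem 2 applies to $g$; the second line removes the cubic term again using the known third moment.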

The conditions in Theorem 2 are only sufficient for a two-point support function; they are not necessary. The following function, for example, has an optimal two-point support at $x^* = \{1/3, 1\}$ for any corresponding feasible $p_1$ and $p_2$ when $X = [0, 1]$.

$$f(x) = \begin{cases}
6/5 - 4x + 5x^2 & \text{if } 0 \le x \le .2, \\
-2x + 1 & \text{if } .2 \le x \le .4, \\
1 - (2/5)x - 8x^2 + 10x^3 & \text{if } .4 \le x \le .6, \\
1 - 4x + 4x^2 & \text{if } .6 \le x \le 1.
\end{cases} \tag{4.5}$$

The function defined in (4.5) does not, however, meet the conditions of Theorem 2.

Note also that not all functions are two-point support functions (although bounds such as (4.4) are generally available). A function requiring three support points, for example, is $f(x) = (1/2) - \sqrt{(1/4) - (x - (1/2))^2}$. This function and its optimal dominating quadratic function are illustrated in Figure 2.


Figure  2.  A  function  requiring  three  support  points. 

Given that a function is a two-point support function, the points $\{x_1, x_2\}$ can be found using a line search to find a maximum. For example, if some candidate $x_1 < \bar{x}$ is given, then a feasible corresponding $x_2$ is

$$x_2 = \frac{\bar{x}^{(2)} - \bar{x}\,x_1}{\bar{x} - x_1}, \tag{4.6}$$

where $p_1 = (x_2 - \bar{x})/(x_2 - x_1)$ and $p_2 = 1 - p_1$. Note that the problem is obviously not feasible if $(x_1, x_2) \not\ni \bar{x}$. The solution of (3.1-(1,0,2)) then reduces to maximizing

$$\gamma(x_1) = p_1(x_1) f(x_1) + p_2(x_1) f(x_2(x_1)) \quad \text{subject to } x_1 \in [a, \bar{x}). \tag{4.7}$$

A line search to find the maximum in (4.7) can be performed efficiently using, for example, Lemarechal and Mifflin's procedure in [26] if $\gamma \in C^2$ or Mifflin and Strodiot's [28] method without derivatives. Table 1 gives the values (under "2-M") that were obtained by this procedure for three two-point support functions with distributions on $[0, 1]$. Figures 3-5 illustrate that the optimal points, $\{x_1^*, x_2^*\}$, may be at either endpoint or interior to $[0, 1]$. The table gives the expectation for a random variable with beta distribution (under "Beta") with the given first and second moments for comparison. The Edmundson-Madansky upper bound ("E-M") is also provided. The "S-L" value (for "semi-linear bound") given in Table 1 is discussed below.
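The reduction to a one-dimensional search in (4.6) and (4.7) can be sketched in a few lines. The code below substitutes a plain grid search for the line-search procedures of [26] and [28], so it is an illustration under that assumption rather than the method used to produce Table 1; the function name, the grid size, and the moment values in the usage lines are likewise hypothetical.

```python
import math

def two_point_bound(f, mean, second, a, b, n=2000):
    """Grid search for the maximum of gamma(x1) in (4.7) on [a, b]."""
    # Largest x1 for which x2 from (4.6) still lies in [a, b].
    x1_max = (b * mean - second) / (b - mean)
    best = None
    for k in range(n + 1):
        x1 = a + (x1_max - a) * k / n
        x2 = (second - mean * x1) / (mean - x1)   # equation (4.6)
        x2 = min(x2, b)                           # guard against rounding past b
        p1 = (x2 - mean) / (x2 - x1)
        value = p1 * f(x1) + (1.0 - p1) * f(x2)   # gamma(x1) in (4.7)
        if best is None or value > best[0]:
            best = (value, x1, x2, p1)
    return best

# Usage with illustrative data: f(x) = exp(-x), mean 1/2, second moment 1/3.
value, x1, x2, p1 = two_point_bound(lambda x: math.exp(-x), 0.5, 1.0 / 3.0, 0.0, 1.0)
```

Restricting $x_1$ to $[a, x_1^{\max}]$ keeps $x_2$ inside $[a, b]$; every grid point then corresponds to a feasible two-point distribution, so the returned value never exceeds the Edmundson-Madansky bound.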


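Table 1. Bounding values

Function                  $\bar{x}$      $\bar{x}^{(2)}$   Beta     2-M           S-L           E-M
$e^{-x}$                  (illegible)    0.333             0.622    (illegible)   (illegible)   (illegible)
$x^3$                     (illegible)    0.714             0.625    (illegible)   (illegible)   (illegible)
$\sin(\pi(x+1)) + 1$      (illegible)    0.333             0.363    (illegible)   (illegible)   (illegible)

Figure 4. Optimal bounding function for $x^3$.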

A line search is not necessary for solving (3.1-(1,0,2)) in certain cases. The support points can be calculated analytically for semi-linear, convex functions that are defined by

$$f(x) = \begin{cases} q^-(c - x) & \text{if } x \le c, \\ q^+(x - c) & \text{if } x > c, \end{cases}$$

where $q^+ + q^- \ge 0$. These functions clearly meet the conditions of Theorem 2 and, hence, require at most two points of support. The analytical results depend on the interval, $[a, b]$. If $[a, b] = [0, 1]$, then consider the nonintersecting intervals, $A = (0, \bar{x}^{(2)}/(2\bar{x}))$, $B = [\bar{x}^{(2)}/(2\bar{x}), (1 - \bar{x}^{(2)})/(2(1 - \bar{x}))]$, and $C = ((1 - \bar{x}^{(2)})/(2(1 - \bar{x})), 1)$. The points of support for a semi-linear, convex function defined on $[0, 1]$ are

$$\{x_1^*, x_2^*\} = \begin{cases}
\{0, \bar{x}^{(2)}/\bar{x}\} & \text{if } c \in A, \\
\{c - d, c + d\} & \text{if } c \in B, \\
\{(\bar{x} - \bar{x}^{(2)})/(1 - \bar{x}), 1\} & \text{if } c \in C,
\end{cases} \tag{4.8}$$

where the moment constraints fix $d = \sqrt{c^2 - 2\bar{x}c + \bar{x}^{(2)}}$.
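The case analysis in (4.8) translates directly into code. The sketch below assumes $X = [0,1]$ and takes the half-width $d$ from the two moment constraints, $d^2 = c^2 - 2\bar{x}c + \bar{x}^{(2)}$, which is an inference made here from feasibility; the function names are hypothetical.

```python
import math

def semilinear_support(c, mean, second):
    """Support points from (4.8) for a semi-linear convex function on [0, 1]."""
    lo = second / (2.0 * mean)                  # upper end of interval A
    hi = (1.0 - second) / (2.0 * (1.0 - mean))  # upper end of interval B
    if c < lo:                                  # c in A
        return 0.0, second / mean
    if c <= hi:                                 # c in B
        d = math.sqrt(c * c - 2.0 * mean * c + second)  # assumed from moments
        return c - d, c + d
    return (mean - second) / (1.0 - mean), 1.0  # c in C

def two_point_probs(x1, x2, mean):
    """Probabilities that reproduce the mean on the points x1 < x2."""
    p1 = (x2 - mean) / (x2 - x1)
    return p1, 1.0 - p1

mean, second = 0.5, 1.0 / 3.0                   # illustrative moments
supports = {c: semilinear_support(c, mean, second) for c in (0.2, 0.5, 0.9)}
```

In all three cases the resulting two-point distribution reproduces both prescribed moments, and the formulas agree at the boundaries between $A$, $B$, and $C$.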

Figure 5. Optimal bounding function for $\sin(\pi(x+1)) + 1$.

Infinite intervals can also be solved analytically for semi-linear, convex functions. For $X = [0, \infty)$, the results are as in (4.8) with $B = [\bar{x}^{(2)}/(2\bar{x}), \infty)$ and $C = \emptyset$. For the interval $(-\infty, \infty)$, the points of support are those for interval $B$ in (4.8). We note that special cases for these supports of semi-linear, convex functions were considered in [9], [20], and [34].

Semi-linear, convex functions are common in decision problems to represent penalties for being above or below a preferred value, $c$. They can also be used, however, to provide bounds for other convex functions when only the first and second moments of the distribution function are known. For example, the following function dominates any convex function $f$ on $[0, 1]$,

$$v(x, c) = \begin{cases}
(c - x)\,\dfrac{f(0) - f(c)}{c} + f(c) & \text{if } x \le c, \\[6pt]
(x - c)\,\dfrac{f(1) - f(c)}{1 - c} + f(c) & \text{if } x > c,
\end{cases} \tag{4.9}$$

where $c \in (0, 1)$. The function $v$ is a semi-linear, convex function for any convex function, $f$. The values of $v$ at the points in (4.8) can, therefore, be used to bound $\int v(x)\,Q(dx) \ge \int f(x)\,Q(dx)$ for any $Q$ that is feasible in (3.1-(1,0,2)). The results of using $v$ to bound $f(x) = e^{-x}$, $x^3$, and $\sin(\pi(x+1)) + 1$ are given in Table 1 as mentioned above. More extensive results in [11] indicate that this approximation is accurate for many functions and moment values.
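A minimal sketch of the semi-linear bound built from (4.9) follows. The choice $c = \bar{x}$, the moments of the uniform distribution, and the helper names are assumptions for illustration only; the source does not state which $c$ was used for Table 1. With $c = 1/2$ the support points are $c \pm d$ with equal probabilities by symmetry.

```python
import math

def dominating_semilinear(f, c):
    """Return v(., c) from (4.9) for a convex f on [0, 1]."""
    fc = f(c)
    left_slope = (f(0.0) - fc) / c
    right_slope = (f(1.0) - fc) / (1.0 - c)
    def v(x):
        if x <= c:
            return (c - x) * left_slope + fc
        return (x - c) * right_slope + fc
    return v

def f(x):
    return math.exp(-x)  # illustrative convex function (an assumption)

mean, second, c = 0.5, 1.0 / 3.0, 0.5
v = dominating_semilinear(f, c)
grid = [k / 200 for k in range(201)]
dominates = all(v(x) >= f(x) - 1e-12 for x in grid)

# Two-point support c +/- d, with d fixed by the moment constraints.
d = math.sqrt(c * c - 2.0 * mean * c + second)
sl_bound = 0.5 * v(c - d) + 0.5 * v(c + d)
```

Because $v$ interpolates the convex $f$ at $0$, $c$, and $1$, it lies above $f$ on $[0,1]$, and the resulting bound falls between the exact uniform expectation and the Edmundson-Madansky bound in this example.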

Semi-linear, convex functions can also be defined to dominate functions defined on
unbounded ranges if those functions have finitely valued recession functions in each
direction. If $(f0^+)(y) < \infty$ for all $y \in (-\infty,\infty)$, then there exists some $c$ such that $f(c) < \infty$ and
$v(x,c) = (f0^+)(x - c) + f(c) \ge f(x)$ for all $x$. The corresponding formulas for semi-linear,
convex functions on unbounded ranges can then again be used to bound $Ef(x)$ for any
feasible distribution that meets the conditions in (3.1-(1,0,2)). Bounds on the expectation
of sublinear functions are especially computable because $(f0^+)(y) = f(y)$ for sublinear $f$
by definition. The next section discusses how to use these bounds in multiple dimensions.
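
As an illustration of the recession-function bound, consider $f(x) = \sqrt{1 + x^2}$, for which $(f0^+)(y) = |y|$. The sketch below is an example constructed here, not one taken from the paper; the function names and the sublinear function $g(x) = \max(2x, -x)$ are assumptions. It builds $v(x,c) = (f0^+)(x - c) + f(c)$ and estimates the recession function by a difference quotient with a large step.

```python
import math

def f(x):
    """Convex function on the whole real line with a finite recession function."""
    return math.sqrt(1.0 + x * x)

def recession_f(y):
    """Recession function of f, derived by hand for this f: (f0+)(y) = |y|."""
    return abs(y)

def v(x, c):
    """Semi-linear bound v(x, c) = (f0+)(x - c) + f(c)."""
    return recession_f(x - c) + f(c)

def recession_estimate(func, c, y, t=1e8):
    """Difference quotient (func(c + t*y) - func(c)) / t for a large t.

    For a convex func this quotient is nondecreasing in t and tends to
    the recession function value at y.
    """
    return (func(c + t * y) - func(c)) / t

def g(x):
    """A sublinear function, so its recession function is g itself."""
    return max(2.0 * x, -x)

c = 0.5
for x in (-10.0, -1.0, 0.0, 0.5, 3.0, 10.0):
    print(x, f(x), v(x, c))   # v(x, c) is never below f(x)
```

For the sublinear $g$, `recession_estimate(g, 0.0, y)` reproduces $g(y)$ up to rounding, which matches the remark that no separate recession calculation is needed in the sublinear case.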

5. Bounds in multiple dimensions. The use of the generalized programming
formulation is limited in multiple dimensions because of the difficulty in solving the
subproblem (3.5). Another difficulty is that, even with a bound only on the first moment of the
distribution function and constrained cross-moments, $E[x_i x_j]$, $i \ne j$, the moment problem
solution ([14]) involves positive weights on $2^N$ extreme points of $X$ when $X$ is a compact
set in $\Re^N$ (and a corresponding number of recession function evaluations when $X$ is not
compact ([5])). These computational disadvantages for large values of $N$ suggest that a
looser but more computationally efficient upper bound on the value of (3.1) may be more
useful than solving (3.1) exactly for large $N$.

If a separable function, $\psi(x,c) = \sum_{i=1}^{N} \psi_i(x(i),c(i))$, is available, it offers an
obvious advantage by only requiring single integrals. In this case, we would like to find
$\psi(x,c) \ge f(x)$ where each $\psi_i(x(i),c(i))$ is a semi-linear, convex function.
Methods for constructing these functions to bound the optimal value of a linear
program with random right-hand side are discussed in [2] and [4]. In these stochastic
linear programs,

$$f(x) = \min_{y \in \Re^n_+} \{\, qy \mid Ay = x \,\}, \tag{5.1}$$

where $q \in \Re^n$ and $A \in \Re^{N \times n}$ are known parameters of the problem. Note that $f$ defined
in (5.1) is sublinear.

The functions $\psi_i$ are found by solving for

$$q_i^{\pm} = \min_{y \in \Re^n_+} \{\, qy \mid Ay = \pm e_i \,\}, \tag{5.2}$$
where $e_i$ is the $i$th unit vector in $\Re^N$. We then have

$$\psi_i(x_i, 0) = \begin{cases} q_i^{+}\, x_i & \text{if } x_i \ge 0, \\ q_i^{-}\, (-x_i) & \text{if } x_i < 0, \end{cases} \tag{5.3}$$


whose expectation can now be bounded using the support points found in Section 4. Note
that this bound requires $2N$ function evaluations instead of the exponential number of
function evaluations required in the Edmundson-Madansky bound.
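
The construction in (5.2) and (5.3) can be illustrated on a small instance with $N = 2$. The sketch below is an example constructed here under stated assumptions: the data `q` and `A` are hypothetical, and the helper `recourse_value` solves the linear program in (5.1) by enumerating basic solutions, which is adequate only for a two-row problem with a finite optimum. A practical implementation would call a linear programming solver instead. The bound rests on sublinearity: $f(x) = f(\sum_i x_i e_i) \le \sum_i f(x_i e_i) = \sum_i \psi_i(x_i, 0)$.

```python
from itertools import combinations

def recourse_value(q, A, x):
    """f(x) = min{ q.y : A y = x, y >= 0 } for a 2-row A.

    Enumerates all 2-column bases and keeps the cheapest feasible basic
    solution. Assumes a finite optimum exists (here q >= 0 and the
    columns of A positively span the plane).
    """
    best = None
    for j, k in combinations(range(len(q)), 2):
        det = A[0][j] * A[1][k] - A[0][k] * A[1][j]
        if abs(det) < 1e-12:
            continue  # columns j and k do not form a basis
        # Cramer's rule for the 2x2 system in (y_j, y_k).
        yj = (x[0] * A[1][k] - A[0][k] * x[1]) / det
        yk = (A[0][j] * x[1] - x[0] * A[1][j]) / det
        if yj < -1e-12 or yk < -1e-12:
            continue  # basic solution violates y >= 0
        cost = q[j] * yj + q[k] * yk
        if best is None or cost < best:
            best = cost
    return best

# Hypothetical problem data (illustration only).
q = [2.0, 3.0, 1.0, 1.5, 4.0]
A = [[1.0, 0.0, -1.0, 0.0, 1.0],
     [0.0, 1.0, 0.0, -1.0, 1.0]]
N = 2

def unit(i, sign):
    """The vector sign * e_i in R^N."""
    return [sign if r == i else 0.0 for r in range(N)]

# The 2N linear programs of (5.2).
q_plus = [recourse_value(q, A, unit(i, 1.0)) for i in range(N)]
q_minus = [recourse_value(q, A, unit(i, -1.0)) for i in range(N)]

def psi(x):
    """Separable bound of (5.3) with c = 0."""
    return sum((q_plus[i] * x[i]) if x[i] >= 0 else (q_minus[i] * (-x[i]))
               for i in range(N))

print(recourse_value(q, A, [1.0, 1.0]), psi([1.0, 1.0]))
```

For this data, $f(1,1) = 4$ (using the fifth column alone) while $\psi(1,1) = 2 + 3 = 5$, so the separable bound is valid but not tight. Only $2N = 4$ linear programs are needed to define $\psi$.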


Values of $c_i$ other than 0 in (5.3) are used when a deterministic vector also appears
in the right-hand side of the linear program in (5.1) (i.e., the constraints are $Ay = x - t$ for
some deterministic vector $t$). More precise bounding functions for $f$ in (5.1) are possible
for distributions on compact regions ([2]). Linear transformations can be used to obtain
other separable, semi-linear convex upper bounding functions ([4]).


6. Conclusions. This paper describes how to bound the expectation of convex
functions when only limited distributional information is available. Given the difficulties in
estimating random phenomena, limited information in the form of bounds on the mean and
second moments of distributions is a common practical situation. The bounds provided in
this paper allow for efficient computation. We note that these procedures extend directly to
lower bounds on the expectation of concave functions. Since the Jensen inequality
is the solution of the generalized moment problem to minimize the expectation of a convex
function subject to a first moment equality constraint and an upper bound on the second
moment, upper and lower bounds are computable on the expectation of general
functions that can be expressed as linear combinations of convex and concave functions.
Given this extension and the use of upper bounding separable, semi-linear convex functions
as in Section 5, the two-point support bounds apply to a wide range of problems.